Filename: 137-bootstrap-phases.txt
Title: Keep controllers informed as Tor bootstraps
Author: Roger Dingledine
Created: 07-Jun-2008
Status: Closed
Implemented-In: 0.2.1.x

1. Overview.

  Tor has many steps to bootstrapping directory information and
  initial circuits, but from the controller's perspective we just have
  a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with
  slow connections or with connectivity problems can wait a long time
  staring at the yellow onion, wondering if it will ever change color.

  This proposal describes a new client status event so Tor can give
  more details to the controller. Section 2 describes the changes to the
  controller protocol; Section 3 describes Tor's internal bootstrapping
  phases when everything is going correctly; Section 4 describes when
  Tor detects a problem and issues a bootstrap warning; Section 5 covers
  suggestions for how controllers should display the results.

2. Controller event syntax.

  The generic status event is:

    "650" SP StatusType SP StatusSeverity SP StatusAction
                                        [SP StatusArguments] CRLF

  So in this case we send
  650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \
  PROGRESS=num TAG=Keyword SUMMARY=String \
  [WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword]

  The arguments MAY appear in any order. Controllers MUST accept unrecognized
  arguments.

  "Progress" gives a number between 0 and 100 for how far through
  the bootstrapping process we are. "Summary" is a string that can be
  displayed to the user to describe the *next* task that Tor will tackle,
  i.e., the task it is working on after sending the status event. "Tag"
  is an optional string that controllers can use to recognize bootstrap
  phases from Section 3, if they want to do something smarter than just
  blindly displaying the summary string.

  The severity describes whether this is a normal bootstrap phase
  (severity notice) or an indication of a bootstrapping problem
  (severity warn). If severity warn, it should also include a "warning"
  argument string with any hints Tor has to offer about why it's having
  troubles bootstrapping, a "reason" string that lists one of the reasons
  allowed in the ORConn event, a "count" number that tells how many
  bootstrap problems there have been so far at this phase, and a
  "recommendation" keyword to indicate how the controller ought to react.

3. The bootstrap phases.

  This section describes the various phases currently reported by
  Tor. Controllers should not assume that the percentages and tags listed
  here will continue to match up, or even that the tags will stay in
  the same order. Some phases might also be skipped (not reported) if the
  associated bootstrap step is already complete, or if the phase no longer
  is necessary.  Only "starting" and "done" are guaranteed to exist in all
  future versions.

  Current Tor versions enter these phases in order, monotonically;
  future Tors MAY revisit earlier stages.

  Phase 0:
  tag=starting summary="starting"

  Tor starts out in this phase.

  Phase 5:
  tag=conn_dir summary="Connecting to directory mirror"

  Tor sends this event as soon as Tor has chosen a directory mirror ---
  one of the authorities if bootstrapping for the first time or after
  a long downtime, or one of the relays listed in its cached directory
  information otherwise.

  Tor will stay at this phase until it has successfully established
  a TCP connection with some directory mirror. Problems in this phase
  generally happen because Tor doesn't have a network connection, or
  because the local firewall is dropping SYN packets.

  Phase 10
  tag=handshake_dir summary="Finishing handshake with directory mirror"

  This event occurs when Tor establishes a TCP connection with a relay used
  as a directory mirror (or its https proxy if it's using one). Tor remains
  in this phase until the TLS handshake with the relay is finished.

  Problems in this phase generally happen because Tor's firewall is
  doing more sophisticated MITM attacks on it, or doing packet-level
  keyword recognition of Tor's handshake.

  Phase 15:
  tag=onehop_create summary="Establishing one-hop circuit for dir info"

  Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
  to establish a one-hop circuit for retrieving directory information.
  It will remain in this phase until it receives the CREATED_FAST cell
  back, indicating that the circuit is ready.

  Phase 20:
  tag=requesting_status summary="Asking for networkstatus consensus"

  Once we've finished our one-hop circuit, we will start a new stream
  for fetching the networkstatus consensus. We'll stay in this phase
  until we get the 'connected' relay cell back, indicating that we've
  established a directory connection.

  Phase 25:
  tag=loading_status summary="Loading networkstatus consensus"

  Once we've established a directory connection, we will start fetching
  the networkstatus consensus document. This could take a while; this
  phase is a good opportunity for using the "progress" keyword to indicate
  partial progress.

  This phase could stall if the directory mirror we picked doesn't
  have a copy of the networkstatus consensus so we have to ask another,
  or it does give us a copy but we don't find it valid.

  Phase 40:
  tag=loading_keys summary="Loading authority key certs"

  Sometimes when we've finished loading the networkstatus consensus,
  we find that we don't have all the authority key certificates for the
  keys that signed the consensus. At that point we put the consensus we
  fetched on hold and fetch the keys so we can verify the signatures.

  Phase 45
  tag=requesting_descriptors summary="Asking for relay descriptors"

  Once we have a valid networkstatus consensus and we've checked all
  its signatures, we start asking for relay descriptors. We stay in this
  phase until we have received a 'connected' relay cell in response to
  a request for descriptors.

  Phase 50:
  tag=loading_descriptors summary="Loading relay descriptors"

  We will ask for relay descriptors from several different locations,
  so this step will probably make up the bulk of the bootstrapping,
  especially for users with slow connections. We stay in this phase until
  we have descriptors for at least 1/4 of the usable relays listed in
  the networkstatus consensus. This phase is also a good opportunity to
  use the "progress" keyword to indicate partial steps.

  Phase 80:
  tag=conn_or summary="Connecting to entry guard"

  Once we have a valid consensus and enough relay descriptors, we choose
  some entry guards and start trying to build some circuits. This step
  is similar to the "conn_dir" phase above; the only difference is
  the context.

  If a Tor starts with enough recent cached directory information,
  its first bootstrap status event will be for the conn_or phase.

  Phase 85:
  tag=handshake_or summary="Finishing handshake with entry guard"

  This phase is similar to the "handshake_dir" phase, but it gets reached
  if we finish a TCP connection to a Tor relay and we have already reached
  the "conn_or" phase. We'll stay in this phase until we complete a TLS
  handshake with a Tor relay.

  Phase 90:
  tag=circuit_create "Establishing circuits"

  Once we've finished our TLS handshake with an entry guard, we will
  set about trying to make some 3-hop circuits in case we need them soon.

  Phase 100:
  tag=done summary="Done"

  A full 3-hop circuit has been established. Tor is ready to handle
  application connections now.

4. Bootstrap problem events.

  When an OR Conn fails, we send a "bootstrap problem" status event, which
  is like the standard bootstrap status event except with severity warn.
  We include the same progress, tag, and summary values as we would for
  a normal bootstrap event, but we also include "warning", "reason",
  "count", and "recommendation" key/value combos.

  The "reason" values are long-term-stable controller-facing tags to
  identify particular issues in a bootstrapping step.  The warning
  strings, on the other hand, are human-readable. Controllers SHOULD
  NOT rely on the format of any warning string. Currently the possible
  values for "recommendation" are either "ignore" or "warn" -- if ignore,
  the controller can accumulate the string in a pile of problems to show
  the user if the user asks; if warn, the controller should alert the
  user that Tor is pretty sure there's a bootstrapping problem.

  Currently Tor uses recommendation=ignore for the first nine bootstrap
  problem reports for a given phase, and then uses recommendation=warn
  for subsequent problems at that phase. Hopefully this is a good
  balance between tolerating occasional errors and reporting serious
  problems quickly.

5. Suggested controller behavior.

  Controllers should start out with a yellow onion or the equivalent
  ("starting"), and then watch for either a bootstrap status event
  (meaning the Tor they're using is sufficiently new to produce them,
  and they should load up the progress bar or whatever they plan to use
  to indicate progress) or a circuit_established status event (meaning
  bootstrapping is finished).

  In addition to a progress bar in the display, controllers should also
  have some way to indicate progress even when no controller window is
  open. For example, folks using Tor Browser Bundle in hostile Internet
  cafes don't want a big splashy screen up. One way to let the user keep
  informed of progress in a more subtle way is to change the task tray
  icon and/or tooltip string as more bootstrap events come in.

  Controllers should also have some mechanism to alert their user when
  bootstrapping problems are reported. Perhaps we should gather a set of
  help texts and the controller can send the user to the right anchor in a
  "bootstrapping problems" page in the controller's help subsystem?

6. Getting up to speed when the controller connects.

  There's a new "GETINFO /status/bootstrap-phase" option, which returns
  the most recent bootstrap phase status event sent. Specifically,
  it returns a string starting with either "NOTICE BOOTSTRAP ..." or
  "WARN BOOTSTRAP ...".

  Controllers should use this getinfo when they connect or attach to
  Tor to learn its current state.

